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Abstract 

Background: Qualitative frameworks, especially those based on the logical discrete formalism, are increasingly used 
to model regulatory and signalling networks. A major advantage of these frameworks is that they do not require 
precise quantitative data, and that they are well-suited for studies of large networks. While numerous groups have 
developed specific computational tools that provide original methods to analyse qualitative models, a standard 
format to exchange qualitative models has been missing. 

Results: We present the Systems Biology Markup Language (SBML) Qualitative Models Package ("qual"), an 
extension of the SBML Level 3 standard designed for computer representation of qualitative models of biological 
networks. We demonstrate the interoperability of models via SBML qual through the analysis of a specific signalling 
network by three independent software tools. Furthermore, the collective effort to define the SBML qual format 
paved the way for the development of LogicalModel, an open-source model library, which will facilitate the 
adoption of the format as well as the collaborative development of algorithms to analyse qualitative models. 

Conclusions: SBML qual allows the exchange of qualitative models among a number of complementary software 
tools. SBML qual has the potential to promote collaborative work on the development of novel computational 
approaches, as well as on the specification and the analysis of comprehensive qualitative models of regulatory and 
signalling networks. 
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Background 

Studies by S. Kauffman [1] and R. Thomas [2] founded the 
logical discrete approach to model biological molecular 
networks and analyse their behaviours. In these networks, 
components (e.g., genes or proteins) assume discrete values 
representing their activity levels (e.g., gene expression). 
Components are connected by directed edges that embody 
regulatory (causal) effects, forming an influence network. 
The activity level of each component evolves depending on 
the activity levels of the components influencing it. The 
rules that determine component activity levels are defined 
in terms of logical rules or functions, corresponding to the 
underlying biological/biochemical regulatory mechanisms. 
The dynamical behaviour of the network is then generated 
by evolving the component levels following a specific 
updating scheme (e.g., synchronous, asynchronous or 
stochastic). The dynamics can subsequently be represented 
in terms of a state transition graph, where the nodes repre- 
sent (discrete) states of the model, while the edges denote 
transitions between these states. 

While other mathematical frameworks, including differ- 
ential equations, can be used to model biological processes 
in great detail, the logical formalism is particularly suitable 
for the modelling of large networks for which precise 
kinetic data are not available. In fact, logical models have 
become increasingly popular. They have been recently 
used to model complex dynamical behaviours and provide 
insights into numerous biological systems, including gene 
regulatory networks (e.g., [3-6]), signal transduction (e.g., 
[7-14]), as well as cell cycle (e.g., [15-18]), in species 
ranging from bacteria and viruses (e.g., [3,19,20]) to yeast 
(e.g., [17,21-23]), flies (e.g., [24-26]), plants (e.g., [27,28]), 
and to humans (e.g., [11-13,18,29]). 

Often based on qualitative knowledge of regulatory 
mechanisms and published data, discrete models can be 
assembled through a "bottom-up" approach, whereby each 
logical function represents specific, biological interactions 
between the components of the network. Recently, "top- 
down" approaches have been applied to construct logical 
models by automatic inference from high-throughput ex- 
periments (e.g., [30]). 

Many simulation and analysis software tools for logical 
models already exist, including ADAM [31], BoolNet 
[32], BooleanNet [33], Cell Collective [34,35] CellNetA- 
nalyzer [36], CellNOpt [30], ChemChains [37], GINsim 
[38], Odefy [39], SimBoolNet [40], and SQUAD [41]. 

The state transition graphs describe the discrete dynam- 
ics of networks and thus embody their dynamical proper- 
ties. However, these graphs may quickly become too large 
and difficult to analyse. This has led several groups to 
propose the use of model-checking techniques [42] that 
explore, for instance, attractors (e.g., stable states or ter- 
minal cycles) and paths leading to them [43] . A number of 
logical modelling tools allow properties of the state 



transition graphs to be verified by means of existing 
model-checking tools, such as NuSMV [44-47]. The prop- 
erties are formulated in terms of temporal logic or in a 
suitable high-level query template capturing recurrent bio- 
logical questions [48]. The model checker tests if the 
state transition graph, which may be explicitly gener- 
ated or implicitly encoded in a symbolic description of 
the model, satisfies the property. For example, while 
GINsim exports symbolically encoded logical models 
into SMV files, BIOCHAM integrates NuSMV [47] 
providing an interface for the specification and verifica- 
tion of properties expressed in several temporal logics 
[46]. A detailed description of the use of model- 
checking techniques in the context of qualitative 
models of biological networks is outside the scope of 
this paper, but see reference [49] for a review and add- 
itional examples. 

Over the years, different formats have been developed 
to store logical models, ranging from simple text files 
containing truth tables and/ or logical functions to XML- 
based file formats. Standards such as the Systems Biology 
Markup Language (SBML [50]) or the Systems Biology 
Graphical Notation (SBGN [51]) have been developed to 
enable unified exchange of biological/biochemical molecu- 
lar maps. SBML supports process-based mathematical 
frameworks with a reaction-centred description of bio- 
chemical processes. Because the building blocks of qualita- 
tive models are fundamentally different from species and 
reactions used in (core) SBML models, previous attempts 
to represent logical models in SBML led to a distorted 
use of the standard. Indeed, variables in Boolean networks, 
logical models and some Petri nets represent discrete 
levels of activities rather than amounts/numbers of 
molecules. Moreover, simulation of logical models do 
not generally imply the notion of a continuous time. 
Consequently, the processes involving them cannot be 
described as reactions per se, but rather as transitions 
between states. 

SBML Level 3 is modular and thereby enables the 
development and inclusion of packages extending the 
core with additional features. Using this modular structure, 
we developed a novel Qualitative Models ("qual") package 
to support the standard definition and exchange of qualita- 
tive (discrete) models. 

It is worth noting that, although SBML qual is cur- 
rently mainly used for logical models, it was developed 
to support standard Petri nets as well, due to common- 
alities between the frameworks. While Petri nets are 
mostly used to study metabolic networks, they have also 
been employed to model regulatory and signalling 
networks (see reviews [52-54]). Until now, the Petri 
net community relied on specialised exchange formats 
(e.g., PNML, http://www.pnml.org) and simulation tools 
that support SBML core (e.g., [55]). 
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Given the open source nature of SBML qual and the 
collaborative nature of the SBML community, the new 
standard should be swiftly adopted and implemented in 
many existing tools supporting logical models and their 
relatives such as Petri nets. The cooperation on SBML 
qual further fostered synergistic efforts to articulate 
and improve existing tools, leading to the launching of 
the Common Logical Modelling Tools (CoLoMoTo) 
project (http://co.mbine.org/colomoto/), which gathers 
many groups developing and using logical modelling 
software tools. 

Methods 

Development of the qual package 

A draft proposal of a SBML package to encode qualitative 
models was initially proposed in 2008. Between 2008 and 
2012, the proposal was refined, through community con- 
sultations and dedicated meetings by developers of various 
related software tools, and in particular members of the 
CoLoMoTo project. In 2011, the proposal was accepted 
through a community vote. The final specification was ac- 
cepted by the SBML Editors in the spring of 2013. 

LibSBML & JSBML 

LibSBML is an application programming interface (API) 
library for reading, writing, manipulating and validating 
content expressed in the SBML format [56]. It is written 
in ISO C and C++, provides language bindings for .NET, 
Java, Python, Perl, Ruby, MATLAB and Octave, and 
includes many features that facilitate the adoption and 
use of both SBML and the libSBML. JSBML, a pure Java 
library for SBML, provides an API that maps all SBML 
elements to a flexible and extended Java type hierarchy 
whilst striving for 100% compatibility with the libSBML 
Java API [57]. As of November 2013, libSBML supports 
SBML qual in its public release (from version 5.9) while 
JSBML supports the package in its development branch 
and will include support in its next major release. 
LibSBML and JSBML are freely available as source code 
and binaries for all major operating systems under the 
LGPL open source terms (see http://sbml.org/Downloads). 
JSBML has been integrated in the LogicalModel library 
(see Results). 

Computer simulations 

To demonstrate the interoperability of models via SBML 
qual, we analysed a specific signalling network using 
three different software tools briefly described below. 

CellNOpt is an open-source software used for creating 
logic-based models of signal transduction networks [30]. 
CellNOpt consists of a set of R packages available in 
Bioconductor, which are also available via a Python 
wrapper, as well as a Cytoscape plug-in (CytoCopteR) 
which contains a SBML qual importer and exporter. 



CellNOpt converts a network (a signed, directed graph) 
into a scaffold of all possible models compatible with 
the network and subsequently trains this scaffold with 
data [58]. It includes a variety of formalisms: (i) Boolean 
models, simulated via synchronous update or by compu- 
tation of steady-states, (ii) semi-quantitative constrained 
Fuzzy logic, and (iii) ordinary differential equations 
(ODEs) derived from the logical model [30]. While the 
choice of a specific formalism depends on the data at 
hand, scope, and question, the followed workflow is 
similar. The network can be simplified by compressing 
nodes that are intermediates between perturbed or mea- 
sured nodes. Links impinging on nodes that are not 
observable (with no readout downstream) or not con- 
trollable (with no perturbation upstream of them) are 
also taken aside as their status cannot be derived from 
the data. 

CellNOpt generates logical models as hyper-graphs by 
adding all combinations of OR and AND gates that are 
compatible with the network (i.e., Sums of Products 
[59]). This leads to a hyper-graph representing a super- 
position of all Boolean models compatible with the ini- 
tial network. Subsequently, an optimisation procedure is 
applied to find the combination of gates and the param- 
eters that best explain the data, by minimising an object- 
ive function that quantifies the difference between data 
and simulation, while penalising model size. This pro- 
vides an optimum model or, more generally, a family of 
optimal models. Optimisation can be performed using a 
built-in genetic algorithm, or using external optimisation 
packages; in particular CellNOpt is connected to Meigo 
[60]. Furthermore, CellNOpt can leverage Answer Set 
Programming to efficiently find all possible Boolean 
models via the software package caspo [61]. 

Once an optimal model (or family of models) has been 
generated, it can be analysed in various ways. For example, 
it can be simulated to predict the outcome of new experi- 
ments [58]. One can also analyse the properties of a family 
of models, or compare models obtained for different 
cell types [62]. One can also identify missing links in 
the network using the module CNORFeeder [63]. The 
flexibility of the scripting languages (R, or Python) simpli- 
fies the writing of analysis workflows, and Cytocopter 
enables combined analysis with other Cytoscape tools 
and plug-ins. 

GINsim is a free (Java) software application devoted to 
the logical (multi- valued) modelling of regulatory and sig- 
nalling networks [38,64]. It provides a user-friendly graph- 
ical interface to define models from scratch. Models can 
also be imported from different formats. GINsim supports 
the simulation of logical models and generates the result- 
ing state transition graphs, considering a range of update 
policies (see below). GINsim also offers a number of func- 
tionalities to explore the dynamical properties of logical 
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models, some of which (e.g., determination of stable 
states) can be efficiently analysed without generating the 
complete network dynamics. 

It is well known that regulatory circuits (or feedback 
loops) can generate crucial dynamical properties [65]: 
positive circuits (encompassing an even number of inhi- 
bitions) produce multi-stability whereas negative circuits 
(encompassing an odd number of inhibitions) underlie 
stable oscillations. To help analyse these properties, 
GINsim identifies all the regulatory circuits embedded 
into a network and compute the regions of the state 
space, called functionality contexts, where they generate 
the related property (multi-stability versus oscillations). 

One can use various updating schemes to generate the 
dynamics of a logical model. When in a given state, sev- 
eral components are called to change their values, these 
updates can be done synchronously, asynchronously, or 
considering a priority scheme [15]. Under the synchron- 
ous scheme, all components are updated simultaneously, 
leading to one transition at most for each state and thus 
resulting in deterministic (linear) state transition se- 
quences. Under the asynchronous scheme, single com- 
ponent updates are considered separately, assuming that 
underlying delays are different but unknown; conse- 
quently, alternative trajectories are often generated, giv- 
ing rise to non-deterministic state transition graphs. 

Of particular interest is the asymptotical dynamical be- 
haviour of these models, which is captured by the notion of 
attractors. From a logical point of view, attractors take two 
forms: stable states, and terminal cyclic strongly-connected 
components (as defined in graph theory). Note that stable 
states and terminal elementary cycles (where in each state, 
a unique component is updated) are shared between syn- 
chronous and asynchronous updating schemes but this is 
not the case for other cyclic attractors. GINsim supports 
both the synchronous and asynchronous updating schemes, 
which can lead to rather distinct dynamical properties. In 
particular, asynchronous dynamics can be quite complex. 

In this respect, Hierarchical Transition Graphs (HTG) 
provide a compact and informative view of the dynamics 
in the form of a graph where nodes embody sets of states 
that are either irreversible (denoting irreversible sequences 
of states) or strongly connected (denoting oscillations in 
the form of transient or terminal complex components). 
For more details on HTG, see reference [66]. 

Finally, to handle the analysis of large models, several 
groups have devised reduction methods [10,58,67,68]. In 
this respect, the last (beta) version of GINsim allows users 
to get rid of (pseudo-) output species that do not regulate 
other nodes or regulate only pseudo-output nodes. This 
reduction has no impact on the number, nature and 
reachability of the attractors and it is particularly efficient 
for signalling networks as shown with our example model 
(see Results section). 



The Cell Collective is a web-based platform for the 
construction, simulation, and analysis of Boolean-based 
models [34,35]. The platform includes a Knowledge Base 
for users to annotate the models and keep track of ex- 
perimental research papers associated with each inter- 
action included in the model. Within the platform, 
models can be shared directly on the web or via down- 
load using the SBML qual format (as well as in the form 
of text files including the list of logical expressions, and 
as xsv files with truth tables). 

Models constructed in Cell Collective are Boolean 
(each species has a Boolean function associated with it, and 
assumes either an active or inactive state), and simulations 
can also include stochastic elements. Furthermore, data 
input/output from the analyses are continuous, providing a 
semi-quantitative measure to better match modelling results 
with laboratory experiments [14,37]. At the input level, this 
is accomplished by assigning a probability of being 
active in time t to each external species (i.e., those with no 
regulators), in contrast to classical Boolean simulations 
where each external species is fixed to 0 or 1. 

The activity level, or the probable active state of an 
output species, is measured by calculating the ratio of 0s 
and l's over the last n time steps (n is configurable to 
any discrete value, [35,37]); this ratio (multiplied by 100) 
provides the activity level on the j-axis (e.g., see Results; 
these parameters can be changed through the user inter- 
face). In the case of real-time simulations, as a single 
simulation evolves in time, the activity level of each spe- 
cies in the model is calculated as the ratio of 0s and Is 
within a predefined sliding window [35,37]. 

One of the assets of the Cell Collective is its user inter- 
face, which has been carefully designed to enable the con- 
struction of computational models in a non-technical 
fashion, in order to render modelling also amenable to 
non-modellers. That is, the construction of the models is 
based purely on qualitative knowledge about a particular 
regulatory mechanism (e.g., kinase X phosphorylates and 
activates species Y), without the need to manually enter 
Boolean expressions (these are created in the background 
based on the biological data provided [69]). Although 
creating relatively small Boolean models can be easily done 
by writing Boolean functions, defining models with species 
regulated by many regulators, or through complex 
regulatory mechanisms can often result in complex, 
nested functions (e.g., [14,29]), which can be cumbersome 
to define manually even for seasoned modellers. 

Results and discussion 

In this section, we present the SBML qual package and 
its validation by exchanging and interpreting a moder- 
ately complex signalling network model among three in- 
dependent software tools. In addition, we illustrate the 
interest of model exchange by applying complementary 
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simulation and analysis features. The section ends with a 
description of the LogicalModel library. 

The SBML qual package 

The SBML qual package extends the core SBML 
Level 3 standard, and enables standard exchange and 
interoperability of discrete models: logical models 
(Boolean and multilevel) and standard Petri nets. The full 
specification is available at http://identifiers.org/combine. 
specifications/sbml.level-3.version- 1 .qual.version- 1 .release- 1 
[70]. The structure of SBML qual is depicted in Figure 1. 
The rationale of the format relies on the features of qualita- 
tive models, with a (discrete) state space and event-driven 
state transition processes. 

The main elements of a SBML qual document are 
QualitativeSpecies, representing the entities of the model 



Model 



as the molecular components of the network, and Tran- 
sitions that contain the rules defining the state of given 
species at each iteration step. Here, the term transition 
is reminiscent of the Petri net terminology where a tran- 
sition is enabled according to the state of its input 
places, and its firing modifies the state of its input and 
output places. For logical models, a transition defines 
the logical rule associated with a network component. 

Each QualitativeSpecies assumes a discrete value (e.g., 
0 or 1 for the Boolean case), and its definition bears an 
attribute initialLevel that specifies the value(s) at the be- 
ginning of the simulation, and an attribute maxLevel 
that specifies the maximal level allowed. For instance, 
maxLevel would be 1 in a Boolean model. As for the 
Species of SBML Core, a QualitativeSpecies is associated 
with a compartment. 



QualitativeSpecies 



id: Sid 

constant: boolean 
initialLevel: non-negative int 



Transition 



Input 



qualitativeSpecies: SldRef 
transition Effect: transition I nputEffect 
thresholdLevel: non-negative int { use- "optional" } 



AA 



Output 



qualitativeSpecies: SldRef 
transition Effect: transitionOutputEffect 
outputLevel: non-negative int { use- "optional" } 



ListOfFunctionTerms 



defaultTerm 



DefaultTerm 



resultLevel: non-negative int { use="required" } 



functionTerm 0..* 



FunctionTerm 



resultLevel: non-negative int { use="required" } 



math 



a a denotes listOf_elements missing : 

(unecessary for basic outline) : 


Math 


xmlns: string {"http://www.w3.org/1998/Math/MathML"} 
MathML content evaluating to a boolean result. 





Figure 1 Simplified UML diagram for the SBML L3 Qualitative Models package (SBML qual). The QualitativeSpecies represent the entities 
involved in the model. These are referenced as either Inputs or Outputs of the Transition element. A Transition describes the way the level of each 
QualitativeSpecies may be altered depending on the levels of other entities in the model (a complete UML diagram can be found in [70]). 
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A Transition comprises Inputs (QualitativeSpecies men- 
tioned in the logical function), Outputs (species whose 
values at iteration t + 1 are determined by the logical rules 
evaluated at t) and FunctionTerms, containing conditions, 
as well as the values that the Output species will assume 
at t+1, whenever a given condition is met. At each iter- 
ation, all FunctionTerms within a Transition are evaluated. 
The term evaluating to true dictates the resulting state 
and the Output species are updated accordingly at t + 1. 

Each member of the ListOfFunctionTerms associated 
with a Transition contains a mathematical expression that 
returns a Boolean, as well as a resultLevel that indicates 
the level to be applied to the Outputs when this expres- 
sion evaluates to true. A defaultTerm is also defined to 
establish the result when none of the FunctionTerms 
apply. The combined set of defaultTerm together with the 
list of FunctionTerms establish the state transitions for the 
entities involved. Figure 2 provides an illustration of a 
simple Boolean model encoded in SBML qual. 



Demonstration of model interoperability 

As part of the SBML extension development and approval 
process (by the SBML Editors) is the requirement that at 
least two independent software tools fully implement the 
proposed package. CellNOpt, GINsim, and the Cell 
Collective have been recently registered with the SBML 
community as tools currently supporting the SBML qual 
package. These tools have been further used to demonstrate 
how a logical model can be handled with different software 
tools using SBML qual as an exchange format. 

As the three aforementioned software tools can pro- 
vide different perspectives on the dynamics of discrete/ 
logical models, this section is organised to demonstrate 
their complementarity. More specifically, we present a 
conceptual pipeline that enables scientists to derive a 
discrete model from high-throughput data, conduct 
thorough analyses, and ultimately use the model to fur- 
ther guide experiments. CellNOpt is used to derive a 
logical model via a top-down approach, exploiting 




^4 



3 qualitative species (A, B, C) 



Transition tr C 



qual:id= 
qual:id= 



tr_C_in_0' 
tr_C_in_l' 




"^^f with two inputs (A, B) 



and one output (C), 
which is assigned a 
value upon tr_C effect 



The default and function 
terms defining the value 
assigned to C 



<?xml version='1.0' encoding='UTF-8' standalone='no' ?> 

<sbml xmlns="http://www.sbml .org/sbml/level3/versionl/core" qual : required="true" level="3" 

xmlns:qual=" http://www.sbml .org/sbml/level3/versionl/qual/versionl" version 
<model id=" example "> 
<listOfCompartments> 

<compartment id="compl" constant="true"/> 
</listOfCompartments> 

<qual :listOf QualitativeSpecies xmlns:qual="http://www.sbml .org/sbml/level3/versionl/qual/versionl' 

<qual :qualitativeSpecies qual : compartment="compl" qual :maxLevel="l" qual : id="A"/> 

<qual :qualitativeSpecies qual : compartment="compl" qual :maxLevel="l" qual : id="B"/> 

<qual :qualitativeSpecies qual : compartment="compl" qual :maxLevel="l" qual : id="C"/> 
</qual : listOfQualitativeSpecies> 

<qual :listOfTransitions xmlns :qual="http://www. sbml .org/sbml/level3/versionl/qual/versionl"> 
<qual : transition qual : id="tr_C"> 
<qual : listOf Inputs> 

<qual:input qual :transitionEffect="none" qual : sign="positive" qual :qualitativeSpecies= 
<qual:input qual :transitionEffect="none" qual : sign="negative" qual :qualitativeSpecies= 
</qual : listOf Inputs> 
<qual : Iist0f0utputs> 

<qual : output qual :transitionEffect="assignmentLevel" qual :qualitativeSpecies="C" qual : id="tr_C_out"/> 
</qual : Iist0f0utputs> 
<qual : listOf FunctionTerms> 

<qual : defaultTerm qual : resultLevel="0"/> C is Set to 0 by default 

<qual : functionTerm qual : resultLevel="l"> 

<math xmlns="http://www.w3.org/1998/Math/MathML"> C is Set to 1 if the following 

<appiy> condition is fulfilled 

<and/> 
<apply> 

Zt\ </ci> A=1 AND B=0 

<cn type="integer"> 1 </cn> 
</apply: 
<apply: 
<eq/: 

<ci> s_B </ci> 
<cn type="integer"> 0 </cn> 
</apply: 
</apply> 
</math> 
</qual : f unctionTerm> 
</qual : listOf FunctionTerms> 
</qual : transition> 
</qual : listOfTransitions> 
</model> 
</sbml> 

Figure 2 A simple Boolean network encoded in SBML qual; the three network components are Boolean, whereby C is activated by A 
and inhibited by B. These regulatory effects are embodied in the Transition element (tr_C), which has 2 inputs (A and B, which levels are not 
modified by the transition), one output (C, whose assigned level is defined in the UstOfFunctionTerms of the transition). The defaultTerm is set to 0, 
while its sole functionTerm specifies (in the form of a MathML element) that C is 1 when A = 1 and B = 0. Note that the SBML format is intended 
to be readable by computers only; hence the code presented in this figure is for illustration purposes only. 
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experimental high-throughput data and an initial qualita- 
tive description of a signalling network (see Methods). 
Inferred models are subsequently simulated and analysed 
using additional techniques implemented in GINsim and 
the Cell Collective. 

Generation of a EGF/TNFa discrete model with CellNOpt 

The focus of CellNOpt is to utilise experimental data to 
generate logical models based on prior knowledge on 
signalling pathways (Prior Knowledge Networks, PKNs). 
The example model used herein is based on a PKN that 
combines two important mammalian signalling pathways 
induced by the Epidermal Growth Factor (EGF) and 
Tumour Necrosis Factor alpha (TNFa). EGF and TNFa 
ligands stimulate ERK, JNK and p38 MAPK cascades, 
the PI3K/AKT pathways, and the NFkB cascade. In 
addition, the network encompasses cross -talks between 
these pathways, as well as two negative feedback loops: 
one in the NFkB cascade and one in the MAPK cascade. 
Note that this network was previously used in [71] to illus- 
trate a variety of logical modelling approaches using syn- 
thetic data. Here, however, we slightly modified this PKN 
by adding an autocatalytic feedback loop on the phosphat- 
ase (ph) regulating the activation of SOS-1 (Figure 3). 



The PKN was subsequently trained using the syn- 
chronous update Boolean simulation, in combination 
with the CNORFeeder package to obtain the optimal lo- 
gical model (see Methods) used as an example in this 
paper (Figure 3). Instead of using experimental data, an 
ordinary differentiation equation (ODE) model repre- 
senting the "true network" was employed to generate the 
data and train the PKN. These data (in the form of time 
series) were thus obtained by simulating the ODE model 
upon stimulation of EGF and TNFa, and inhibition of 
PI3K and Raf-1 in different combinations. The readout 
nodes (i.e., the proteins whose activities were measured 
upon stimulation) are highlighted in Figure 3. To reflect 
imprecisions in our knowledge of biological pathways, 
the topology of the data generator model ("golden stand- 
ard") is slightly different from the PKN. More precisely, 
a link from Map3K7 to MKK7 has been omitted in the 
PKN, to which an extra edge from PI3K to Map3Kl was 
further added. A workflow with CellNOpt was able to 
recover the "golden standard" model from this PKN and 
the experimental data (see http://www.cellnopt.org/doc/ 
cnodocs/examples_sbml.html for more detail). This final 
model was then exported to SBML qual and simulated 
and analysed using all three tools. 
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TNFa 










)) — 


EGFR 




TNFR 




Figure 3 Boolean model obtained by CellNOpt and visualised using the Activity Flow language of the Systems Biology Graphical 
Notation [51] and drawn with CySBGN [72]. Different colours define the experimental design of the data used to train the model: (i) green 
boxes denote external stimuli, (ii) red boxes correspond to species blocked by kinase inhibitors, and (iii) blue boxes denote species that were 
measured (readouts). 
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Dynamical cross-validation of the model 

Dynamical (synchronous) simulations of the EGF/TNFa 
network for the four different initial conditions gave 
consistent results with the three software tools. In 
this respect, Figure 4 shows the consistent global state 
evolution, attractor reachability, as well as temporal 
evolution of selected nodes for two of these conditions 
(see Additional file 1 for a full set of simulation results). 
Depending on the initial condition, simulations result 
in one of two stable states or in one of two cycles 
encompassing six states. 

Model analyses in GINsim 

As mentioned in the previous section, starting from the 
null state, simulations using synchronous updating of 
the example model result in a unique attractor for 
each of the four combinations of the two external inputs 
(EGF and TNFa), either a stable state or a simple terminal 
cycle (Additional file 1). Although we expect to get the 
same attractors under the asynchronous update, their 
reachability may be affected. Moreover, the number of 
states possibly visited before reaching an attractor may 
greatly differ between synchronous and asynchronous 



simulations, as well as the characteristics of the transient 
dynamics. 

For example, when EGF = 1 and TNFa = 1, the asyn- 
chronous state transition graph is substantially larger with 
about 116 k states, as opposed to the 19 states obtained 
with a synchronous update (Figure 4B). To contain the 
size of the state transition graph, we can reduce the model 
by removing all (pseudo-) output nodes (cf. Methods). 
Applying this reduction and thereby eliminating 11 nodes 
as (pseudo-) outputs results in a significant reduction of 
the state transition graph (down to 546 states, Figure 5A). 
Importantly, the unique reachable attractor is identical to 
that obtained with the synchronous update. The resulting 
HTG (Figure 5A) has a peculiar staged structure with a 
series of irreversible components (labeled i# followed by 
the number of states included) that the system may leave 
to undergo transient oscillations (labeled #ct), to eventually 
reach the final cyclic attractor (labelled #ca). Interestingly, 
by defining priority classes and imposing that IKK update 
is slower than all other species, one can get rid of all these 
transient oscillations (Figure 5B). 

To maintain input components (EGF and TNFa) con- 
stant, implicit self-activations are defined. These two 
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Figure 4 Dynamic profile of the EGF/TNFa model. The model was simulated with consistent results in the Cell Collective (on the left of panels 
A & B), GINsim (on the right of panels A & B), and CellNOpt (data not shown, but simulations were consistent with those presented here). 
Synchronous simulations were performed under four input conditions: (i) EGF = TNFa = 0; (ii) EGF = TNFa = 1 ; (iii) EGF = 0 & TNFa = 1 ; (iv) EGF = 1 
&TNFa = 0. Results for conditions in (i) and (ii) are presented in this figure (the remaining two can be found in Additional file 1). Charts on the 
top left of each panel correspond to the overall dynamic profile across all nodes in the model. Black cells correspond to active (1) states, whereas 
inactive (0) states are white. The bottom graphs at the bottom left illustrate the time course of selected nodes. The GINsim columns show State 
Transition Graph and the Hierarchical Transition Graph (HTG) generated with the tool. Note that due to the synchronous updating, the irreversible 
components of the (HTG) correspond to linear chains of states. A) EGF = TNFa = 0. The network reaches a steady state (shown with both GINsim 
and the Cell Collective) after 3 transient states. The order of the individual species states in the steady state generated by GINsim is sorted in the 
same (alphabetical) order, as presented on the Cell Collective side. B) EGF = TNFa = 1. After 12 transient states, the network reaches a cyclical 
attractor encompassing six states. Note that in order to simulate the example model in the Cell Collective as a traditional Boolean network 
(i.e., with binary input/output), the external species were set to 100 or 0, and the sliding window was set to 1 (see Methods). 
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Asynchronous Hierarchical Transition Graph 
from the null state, with egf=tnfa=1, 
reducing all (pseudo-)outputs 
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Figure 5 Properties of the EGF/TNFa model analysed with GINsim. A) The Hierarchical Transition Graph (HTG) representing the dynamics 
of the reduced model (i.e., [pseudo-] outputs removed), under the asynchronous scheme, starting from the null initial state and the EGF= 1 & 
TNFa = 1 condition. The HTG shows the organisation of the dynamics with a chain of 13 irreversible sets (grey nodes, each encompassing 36 
states) connected, at each stage, to a chain of 12 transient cycles (light blue nodes, each encompassing six states) and a unique cyclic attractor 
(pink node, also encompassing six states). B) By defining a lower priority for the update of IKK, all transition states towards transient cycles 
(in light blue in panel A) are prevented: the system reaches the cyclic attractor without visiting the same state twice. On the right of the panel 
B, the corresponding State Transition Graph (STG) starting from the initial state (contained in the HTG state set in green) and leading to the cyclic 
attractor (pink nodes). This STG is shown to illustrate the complexity of the transient dynamics and is not meant to be readable. C) The HTG 
showing that, under the asynchronous scheme, different attractors are reachable. Here the two cyclic attractors differ by the value of ph 
(arrows in bold embody transitions increasing the value of ph). D) Stable states for the wild-type, IKK knock-out and ectopic expression of 
ERK, respectively. 



(functional) positive circuits explain the presence of at 
least four attractors; the combinations of input values de- 
fine a partition of the state space in four disconnected re- 
gions. Using GINsim, we can verify that the EGF/TNFa 
model encompasses two additional functional circuits: a 
three-element negative circuit involving IkB, NFkB and ex, 
and a positive auto- regulatory circuit on ph. The function- 
ality context of the negative circuit corresponds to IKK = 1 
(which is the case when TNFa is 1). This negative circuit 
enables the attractors where IkB, NFkB and ex oscillate. 
The functionality context of the positive auto-regulatory 
circuit is defined by ERK = 0. This circuit explains the 
presence of two attractors when EFG = 0. 

For the input configuration where EGF = 0 and TNFa = 1, 
starting from an initial state with ERK = 1, under the 



asynchronous update, the system is able to reach two cyclic 
attractors differing by the presence of ph (Figure 5C). 
Trajectories leading to the terminal cycle with ph = 0 
are discarded by the synchronous update in which the 
decrease of ERK (in the absence of EGF and thus MEK1) 
occurs together with the increase of ph, already in the first 
step of the simulation, leading to the cyclic attractor 
where ph = 1. 

Using GINsim, common perturbations such as gene 
knock-outs or ectopic gene expressions, as well as their 
effects, can be easily simulated. For instance, knocking 
out IKK eliminates oscillations (that were present under 
wild-type simulations when TNFa = 1), as a result of the 
interruption of the Ii<B-NFi<B-ex negative circuit 
(Figure 5D). Similarly, the simulation of ERK ectopic 
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expression, disrupting the functionality context of ph 
auto-regulation, leads to the loss of the multi-stability 
when EGF = 0 (Figure 5C). 

Simulations and biological application with the cell 
collective 

The Cell Collective platform aims at facilitating collab- 
orative modelling for experimental scientists. The plat- 
form offers two modes of simulations: input-output 
dynamical analyses across hundreds of simulated envi- 
ronments, along with real-time, interactive simulations. 
Examples of simulations of the EGF/TNFa model with 
the Cell Collective are illustrated in Figure 6. 

Figure 6A-C illustrates simulations of input-output re- 
lationships between external species stimulating the net- 
work and species of interest regulated in response to 
this stimulation. Laboratory studies to identify functional 
relationships between extracellular stimuli and various 



cellular components are often expensive and resource 
consuming. The dynamical analysis tool implemented in the 
Cell Collective allows users to conduct in silico experiments 
mimicking laboratory experiments, with the advantage that 
researchers can simulate hundreds or thousands of extracel- 
lular and/or disease-related situations (as opposed to the 
limited number of scenarios that can be reasonably handled 
in the laboratory) and generate rich input-output relation- 
ships (i.e., dose-response curves) between network stimuli 
and any species in the network. In this respect, inputs and 
outputs can take continuous values on a scale from 0 to 100 
(see Methods), despite the discrete (Boolean) nature 
of the network model. For example, Figure 6A shows 
a dose-response curve and a positive correlation between 
EGF and Akt. In contrast, inhibition of PI3K results in the 
loss of EGF-dependent activation of Akt (Figure 6B). 
Finally, the input-output relationship between TNFa 
and IkB is illustrated in Figure 6C. 
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Figure 6 Examples of simulations in the Cell Collective. In panels A-C, the model was simulated 100 times, 800 time steps each. For each of 
the 100 simulations, an activity level between 0 and 100 (i.e., probability of being active at time f) was randomly selected for EGF (panels A and 
B) and TNFa (panel C). A) Dose-response curve illustrating the activation of Akt under changing levels of EGF. B) Inhibition of PI3K results in the 
loss of EGF-induced activation of Akt. C) Dose-response curve illustrating inactivation of IkB in response to increasing levels of TNFa. D-F) 
Real-time simulation under varying conditions: D) Setting EGF to 50% (using the sliders illustrated above the plot) results in an intermediate activation 
of Akt, and transient activation of Ras and Erk. E) Simulated Ras gain-of-function (introduced around time step 50), results in the activation of Ras, which 
subsequently stimulates Erk. Akt remains active at around 50%. F) The removal of EGF (turning it to 0%) results in the decrease of Akt activity, while Erk 
continues to rise due to Ras mutation. Any and all species in the model can be displayed during the real-time simulations; the three species Akt, Erk, 
and Ras were selected for illustration purposes only. Note that the activation levels do not correspond to concentrations or any molecular 
measurements; they rather provide a semi-quantitative activity measure to analyse the effects of changes in the model (e.g., perturbations) on 
the rest of the network. 
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Real-time interactive simulations with the Cell Collective 
enable users to interactively change the environment during 
the simulations. This tool enables users to test "what-if ' sce- 
narios, e.g., changes in the external conditions, as well as of 
(transient) gain/loss -of-function, with instant feedback in 
terms of the changing activity levels of affected species. To 
illustrate the utility of this mode, we simulated the EGF/ 
TNFa model under a condition where EGF was set to a 
medium activity level, while keeping TNFa inactive. This 
condition results in the activation of Akt, Erk, and Ras 
(Figure 6D). The simulation of a Ras gain-of-function results 
in further activation of Erk, but not of Akt (Figure 6E). In 
contrast, Akt continues to respond to EGF activation and 
deactivation (Figure 6F and G, respectively). This is because 
Akt (unlike Erk) does not lie downstream of Ras (Figure 2) 
and hence is not affected by the constitutively activated Ras. 

The logical model library 

In order to ease the adoption of the new standard, an open 
source (Java) library, LogicalModel has been created. The 
library can be used as a standalone command line tool for 
model conversion, and can be accessed at https://github. 
com/colomoto/logicalmodel. It provides a data structure to 
manipulate logical models, as well as a set of analytic tools 
(e.g., stable state identification, model reduction) that are 
common to many scientific efforts relying upon a discrete 
modelling approach. The library further provides import and 
export filters for SBML qual (through JSBML; see Methods 
section), as well as an interface enabling the integration of 
logical models and SBML qual with additional formats and a 
number of existing software tools. The development of this 
library coincides with the onset of the CoLoMoTo initiative. 

Figure 7 illustrates the central role of the LogicalModel 
library and main current model exchange capabilities of 



popular software tools, covering logical models as well as 
additional related qualitative modelling frameworks. 
Included are importers that have been recently developed 
to generate (non parameterised) qualitative models from 
pathway databases: KEG G translator [73] and Path2Models 
project [74]. 

Conclusions 

In order to enable the interoperability of qualitative, 
discrete models, a standard exchange format was necessary. 
While the previous versions of SBML were not fully com- 
patible with qualitative modelling approaches, the modular 
structure of SBML release (Level 3) enables the develop- 
ment of additional packages to support novel modelling 
frameworks and capabilities. 

In this paper, we report on the SBML Level 3 Qualitative 
Model (SBML qual) package, which provides a standard 
means for the exchange of logical models of regulatory 
and signalling networks. Currently, at least three software 
tools support both import and export via SBML qual 
(GINsim, CellNOpt, and the Cell Collective), while other 
tools such as GNA [75] and CellNetAnalyzer can export 
models to this format. The former three tools have been 
used here to demonstrate the consistency of the standard 
via simulations and analyses of a Boolean model of 
EGF/TNFa signal transduction pathways. The combined 
use of software tools is now facilitated, providing model- 
lers with a range of complementary means to investigate 
their models. 

Repositories of models encoded in SBML qual are 
already being prepared. For instance, the Cell Collective 
now contains numerous previously published logical 
models that can be downloaded. BioModels Database is a 
repository of computational models of biological networks 
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[76] that currently hosts several curated SBML qual 
models (http://www.ebi.ac.uk/biomodels/). 

SBML qual will continue to be refined by the commu- 
nity. Some of the improvements discussed so far include, 
for example, the definition of models where parameters 
are not (all) instantiated, models for which timing con- 
straints (or rates) are specified, extended Petri nets, etc. 
In addition, further integration with SBML Core con- 
cepts is planned. In particular, SBML qual will be useful 
to so-called hybrid formalisms, which combine features 
of both discrete and continuous models. A typical ex- 
ample are formalisms embedding a logical representa- 
tion of the interaction structure of the network into a 
continuous model of its dynamics, such as piecewise- 
linear differential equation models [77], hybrid autom- 
ata [78,79], or even fully continuous models in which 
the logical functions have been replaced by sigmoidal 
functions preserving the logic of the interactions [80]. 
Other hybrid formalisms that have been used for the 
modelling of regulatory and signalling networks are 
fuzzy logic-based models [81] and timed automata 
[82,83]. Software tools enabling the modelling, simula- 
tion, and analysis of networks by means of different 
kinds of hybrid models include CellNOpt, Odefy [39], 
SQUAD [41], GNA [75], and Q2LM [84]. Most of the 
these tools support SBML Core. 

Last but not least, to further support data and result re- 
producibility, the standardisation of algorithms and simu- 
lation schemes and parameters for qualitative models is 
planned by adopting the MIASE (Minimum Information 
About a Simulation Experiment) guidelines [85]. First 
steps in this direction have already been taken by adding 
simulation algorithms relevant to logical models to the 
Kinetic Simulation Algorithm Ontology (KiSAO [86]), 
allowing qualitative models to be used in simulations de- 
scribed in the Simulation Experiment Description Markup 
Language (SED-ML [87]). 

The availability of SBML qual and the inception of the 
CoLoMoTo consortium should foster the collaborative 
development of standards (including the extension of 
existing ones), as well as of computational methods for 
the qualitative modelling of biological networks. In this 
respect, anyone interested in these efforts is cordially in- 
vited to enter into contact with the existing community 
at sbml-qual@lists.sourceforge.net. 

Additional file 



white. The bottom graphs in the Cell Collective column illustrate the 
time course of selected nodes. The GINsim columns show State 
Transition Graph and the Hierarchical Transition Graph (HTG) generated 
with the tool. Note that due to the synchronous updating, the 
irreversible components of the (HTG) correspond to linear chains of 
states. Simulations were performed under four input conditions: 
EGF = TNFa = 0; EGF = TNFa = 1 ; EGF = 0 & TNFa = 1 ; EGF = 1 & TNFa = 0. 
A) EGF = TNFa = 0. The network reaches a steady state (shown in both 
GINsim and the Cell Collective column) after 3 transient states. The order 
of the individual species states in the steady state generated by GINsim is 
sorted in the same (alphabetical) order, as presented in the Cell 
Collective column. B) EGF = 1 , TNFa = 0. The network reaches a steady 
state after 14 transient states. C) EGF = 0, TNFa = 1. Following 5 transient 
states, the network reaches a 6-cycle attractor. D) EGF = TNFa = 1. After 
12 transient states, the network reaches a cyclical attractor encompassing 
six states. 
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Additional file 1: Complete dynamic profile of the example model. 

The model was simulated with consistent results in the Cell Collective 
(left column of panels A & B), GINsim (right column of panels A & B), and 
CellNOpt (data not shown, but simulations were consistent with those 
presented here). Charts at the top of the Cell Collective column 
correspond to the overall dynamic profile across all nodes in the model. 
Black cells correspond to active (1) states, whereas inactive (0) states are 
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